Cross Reference to Related Applications

The present invention claims priority from U.S. Provisional Patent
Application Ser. No. 60/157,711, filed on October 5, 1999, the entire
disclosure of which is incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to conferencing systems, and
more particularly to a videoconferencing apparatus for use with
multi-point conferences.

2. Background of the Prior Art


Videoconferencing systems have become an increasingly popular and
valuable business communications tool. These systems facilitate rich
and natural communication between persons or groups of persons located
remotely from each other, and reduce the need for expensive and
time-consuming business travel.

At times, it may be desirable to conduct multi-point conferences,
wherein three or more parties (each party consisting of an individual
or group located at a particular conference endpoint) participate in
the conference. Multi-point conferences are particularly useful in
situations where several interested parties need to participate in the
resolution of an issue, or where information is to be disseminated on
an enterprise-wide level. However, commercially available
videoconferencing systems are generally capable of communicating with
only one other conference endpoint at a time. To conduct multi-point
conferences, the conference endpoints are conventionally
interconnected through an external piece of equipment called a
multi-point control unit (MCU). The MCU is provided with multiple
ports for receiving signals representative of audio and video
information generated at each of the conference endpoints. The
received signals are mixed and/or switched as appropriate, and the
mixed/switched signals are subsequently transmitted to each of the
conference endpoints.

A significant disadvantage associated with the use of MCUs is their
expense. An enterprise wishing to conduct multi-point conferences must
either purchase an MCU, which may cost upwards of $50,000, or contract
for "video bridge" services through a telephone company, wherein an
MCU located at the telephone company's facilities is rented on a
fee-per-unit-of-usage basis. In either case, the high cost of
purchasing or renting an MCU may dissuade a company from conducting
multi-point conferences, even when it would be useful to do so.

Conventional MCUs further require a dedicated Inverse 
Multiplexer (IMUX) for each endpoint of a multi-point 
conference. These dedicated IMUXs are hardware devices
which must be purchased and installed at additional cost to 
achieve increased endpoint capability. 

Finally, conventional MCUs include hard-wired 
processing units each having a dedicated set of channels 
associated therewith. Thus, unused channels associated 
with a processing unit are unavailable for allocation to 
additional endpoints.

What is therefore needed in the art is a relatively 
low-cost videoconferencing apparatus which can dynamically 
allocate unused channels on an as-needed basis.



SUMMARY OF THE INVENTION

The present invention is directed to a multi-point (MP) conferencing
application having dynamically allocable software-based IMUX
functions. The IMUX functions are implemented in a software-based
circuit switch operable to aggregate a plurality of processing trains
to a wideband serial data stream. The IMUX functions are created on an
as-needed basis for each endpoint in a multi-point conference.

The MP conferencing application is coupled to a conventional network
interface including a time division multiplexer. The time division
multiplexer is in turn coupled to a plurality of communication ports,
which may typically include ISDN ports, enabling an apparatus
including the MP conferencing application to be coupled to two or more
remote conference endpoints through a switched network.

The MP conferencing application is operable to process the plural
signal streams received through the communication ports. Generally,
the MP conferencing application generates separate processing trains
for signal streams from/to each of the remote conference endpoints.
The processing trains each comprise a communication process and a set
of codecs. In the receive mode, an IMUX function combines signal
streams (representative of a single conference endpoint) distributed
over two or more channels into a single, relatively high bandwidth
channel. The communication process, which may for example comprise an
H.320 (ISDN-based) or H.323 (packet-based) process, separates the
signal stream into audio and video signals, and performs certain
processing operations (such as delay compensation) associated
therewith. The audio and video signals are thereafter respectively
delivered to audio and video codecs for decoding.

The decoded audio and video streams output by each of the processing
trains, together with the locally generated audio and video signals,
are combined at an audio mixer and a video switching/continuous
presence module. The video module may be configured to selectively
generate as output video data representative of a composite or
continuous presence image, wherein video information (e.g., images of
the conference participants) corresponding to each of the conference
endpoints is displayed in different sectors of the screen. The
combined audio and video data streams are conveyed as input to each
processing train for encoding and transmission to the corresponding
conference endpoints. In the send mode, the audio and video signals
are encoded by the audio/video codecs and multiplexed into a single
data stream by the communication process. The combined audio/video
data stream is then conveyed to the IMUX function, which distributes
the combined audio/video data stream over the channels associated with
the selected remote conference endpoint.

BRIEF DESCRIPTION OF THE FIGURES
FIG. 1 depicts a near videoconferencing endpoint interconnected with
two remote videoconferencing endpoints, the near videoconferencing
endpoint having integrated multi-point conferencing capabilities;

FIG. 2 is a block diagram of the near conferencing endpoint;

FIG. 3 is a block diagram of a multi-point conferencing application of
FIG. 2;

FIG. 4 is a block diagram of an exemplary signal processing train of
FIG. 3; and

FIG. 5 is a block diagram of an exemplary network interface.


DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

FIG. 1 depicts an exemplary operating environment of the multi-point
(MP) conferencing application of the present invention. A near
conference endpoint 100, embodying the MP conferencing application, is
coupled to remote conference endpoints 102 and 104 via a network 106.
Remote conference endpoints 102 and 104 may comprise, for example,
conventional videoconferencing devices equipped to transmit and
receive both video (image) data and audio (speech) data.
Alternatively, one or more of remote conference endpoints 102 and 104
may comprise conventional audio conferencing devices limited to
reception and transmission of audio data. It should be appreciated
that while only two remote conference endpoints are depicted in FIG. 1
for the purpose of clarity, a greater number of remote conference
endpoints may be accommodated by near conference endpoint 100.

Network 106 may be of any type suitable for the transmission of audio
and video data between and among near conference endpoint 100 and
remote conference endpoints 102 and 104. Typically, network 106 will
comprise the public switched telephone network (PSTN) or comparable
circuit switched network to which each of the conference endpoints is
connected by one or more ISDN lines. A multi-point conference is
initiated by establishing a connection between near conference
endpoint 100 and remote conference endpoint 102, and between near
conference endpoint 100 and remote conference endpoint 104.
Establishment of the connections may be effected through a dial-up
procedure, or through use of a dedicated line.

Alternatively, network 106 may comprise a packet switched network,
such as the Internet. Although a single network 106 is shown, the
invention contemplates the use of two or more networks (for example,
the PSTN and the Internet) to connect conference endpoints utilizing
different communication protocols.

Reference is now directed to FIG. 2, which depicts in block form
various components of near conference endpoint 100. A conventional
video camera 202 and microphone 204 are operative to generate video
and audio signals representative of the images and speech of the near
conference participant (the person or persons co-located with near
videoconference endpoint 100). A video monitor 208 and loudspeaker 210
present images and speech of the remote conference participants
combined with locally generated images and speech. An audio I/O
interface 212, configured to perform A/D and D/A conversion and
related processing of audio signals, couples microphone 204 and
loudspeaker 210 to CPU 220 and memory 222 through bus 226. Similarly,
video camera 202 and monitor 208 are coupled to console electronics
213 through video I/O interface 214.

Console electronics 213 additionally include a central processing unit
(CPU) 220 for executing program instructions, a memory 222 for storing
applications, data, and other information, and a network interface 224
for connecting near conference endpoint 100 to network 106. Memory 222
may variously comprise one or a combination of volatile or
non-volatile memories, such as random access memory (RAM), read-only
memory (ROM), programmable ROM (PROM), or non-volatile storage media
such as hard disks or CD-ROMs. At least one bus 226 interconnects the
components of console electronics 213.

Network interface 224 is provided with a plurality of ports for
physically coupling near conference endpoint 100 to a corresponding
plurality of ISDN lines 240-246 or similar transmission media. The
number of ports will be determined by the types of connections to
network 106, the maximum number of remote conference endpoints which
may be accommodated by videoconference endpoint 100, and the required
or desired bandwidth per endpoint connection. Depending on bandwidth
requirements, data communicated between near conference endpoint 100
and a remote conference endpoint may be carried on a single ISDN line,
or may be distributed (for higher bandwidth connections) among a
plurality of ISDN lines.

Stored within memory 222 are an operating system 230, 
a call manager application 232, and the MP conferencing 
application 234. Operating system 230 controls the 
allocation and usage of hardware resources, such as CPU 220 
and memory 222. Call manager application 232 controls the 
establishment and termination of connections between near 
conferencing endpoint 100 and remote conference endpoints 
102 and 104, and may also furnish information 
characterizing the nature of individual connections to MP 
conferencing application 234. 

As will be described in further detail below, MP 
conferencing application 234 is configured to instantiate a 
processing train for each remote conference endpoint 102 
and 104 to which near conference endpoint 100 is connected. 
The processing trains process audio and video data streams 
received from remote conferencing endpoints 102 and 104. 
The processed audio and video data streams are combined 
with each other and with locally generated audio and video 
streams, and the combined audio and video streams are 
thereafter distributed to remote conferencing endpoints 102 
and 104.

FIG. 3 is a block diagram showing the various components of an
embodiment of MP conferencing application 234 and the flow of data
between and among the various components. MP conferencing application
234 includes a circuit switch 350, a plurality of processing trains
302 and 304, a video switching/continuous presence module 306, and an
audio mixing module 308. The circuit switch 350 dynamically
instantiates a number of high bandwidth processing trains equal to the
number of remote conference endpoints to which near conference
endpoint 100 is connected and preferably includes a dynamically
created IMUX allocated to each remote conference endpoint. Each IMUX
preferably utilizes a bonding protocol. In the example depicted in the
figures, the circuit switch 350 dynamically allocates two IMUXs and
generates two processing trains 302 and 304 respectively corresponding
to remote conference endpoints 102 and 104.
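
By way of illustration only, the dynamic allocation described above may be sketched as follows. The class names, the shared channel pool, and the channel counts are hypothetical and are not taken from the figures; the sketch merely shows one IMUX and one processing train being created per connected endpoint, with channels drawn from and returned to a common pool.

```python
from dataclasses import dataclass, field


@dataclass
class SoftwareImux:
    """Hypothetical stand-in for a software IMUX bound to some channels."""
    channels: list


@dataclass
class ProcessingTrain:
    """Hypothetical per-endpoint train (communication process plus codecs)."""
    endpoint_id: int
    imux: SoftwareImux


@dataclass
class CircuitSwitch:
    """Creates one IMUX and one processing train per connected endpoint."""
    free_channels: list
    trains: dict = field(default_factory=dict)

    def connect(self, endpoint_id, channels_needed):
        if channels_needed > len(self.free_channels):
            raise RuntimeError("not enough free channels")
        # Take channels from the shared pool rather than a fixed per-unit set.
        granted = self.free_channels[:channels_needed]
        self.free_channels = self.free_channels[channels_needed:]
        train = ProcessingTrain(endpoint_id, SoftwareImux(granted))
        self.trains[endpoint_id] = train
        return train

    def disconnect(self, endpoint_id):
        # Return the endpoint's channels to the pool for reuse.
        train = self.trains.pop(endpoint_id)
        self.free_channels.extend(train.imux.channels)


# Illustrative pool of eight channels serving endpoints 102 and 104.
switch = CircuitSwitch(free_channels=list(range(8)))
train_302 = switch.connect(endpoint_id=102, channels_needed=2)
train_304 = switch.connect(endpoint_id=104, channels_needed=4)
```

Because the pool is shared, channels released by one endpoint become available to any other endpoint, which is the behavior contrasted in the background with hard-wired processing units.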


Processing trains 302 and 304 preferably comprise software routines
which process received and transmitted audio and video signals in
accordance with predetermined algorithms. In the receive mode,
processing train 302 is instantiated by circuit switch 350 to include
signals representative of audio and video data transmitted by remote
conference endpoint 102. Illustratively, remote conference endpoint
102 may transmit signals on ISDN lines, each ISDN line comprising two
distinct 64 Kb/sec bi-directional channels ("Bearer channels"). Those
skilled in the art will recognize that a smaller or greater number of
ISDN lines may be utilized for communication with remote conference
endpoint 102. As will be described in connection with FIG. 4,
processing train 302 is operative to extract and decode audio and
video data from signals received from remote conference endpoint 102.
Decoded audio data is conveyed to audio mixing module 308 over audio
data path 352, and decoded video data is conveyed to video
switching/continuous presence module 306 over video data path 354.
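
The channel figures given above imply a simple relationship between the number of ISDN lines and the bandwidth available to an endpoint connection. The following sketch, whose function names are hypothetical, works through that arithmetic using only the two-channel, 64 Kb/sec figures stated in the description.

```python
BEARER_CHANNEL_KBPS = 64      # 64 Kb/sec per Bearer channel, as stated above
BEARER_CHANNELS_PER_LINE = 2  # two Bearer channels per ISDN line


def aggregate_kbps(isdn_lines):
    """Total Bearer-channel bandwidth offered by a number of ISDN lines."""
    return isdn_lines * BEARER_CHANNELS_PER_LINE * BEARER_CHANNEL_KBPS


def lines_needed(target_kbps):
    """Smallest number of ISDN lines whose Bearer channels reach the target."""
    per_line = BEARER_CHANNELS_PER_LINE * BEARER_CHANNEL_KBPS
    return -(-target_kbps // per_line)  # ceiling division
```

For example, under these figures a connection spread over three ISDN lines would have six Bearer channels totalling 384 Kb/sec.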

Processing train 304 similarly receives audio and 
video data transmitted by remote conference endpoint 104. 
Processing train 304 extracts and decodes the audio and 
video data and subsequently passes the decoded audio and 
video data to audio mixing module 308 and video 
switching/continuous presence module 306 over audio and 
video data paths 370 and 372. 

Audio mixing module 308 is configured to combine audio data received
from remote conference endpoints 102 and 104 with locally generated
audio data (received from audio I/O interface 212 via audio data path
374, and typically being representative of the speech of the near
conference participant(s)). The term "combine" is used in its broadest
and most general sense and is intended to cover any operation wherein
audio mixing module 308 generates an output audio data stream (or
plurality of output audio data streams) based on information contained
in the remotely and locally generated audio data input streams. For
example, audio mixing module 308 may simply mix the received audio
input data streams, or it may be configured as an audio switch wherein
it selects one of the received audio input data streams for output in
accordance with predetermined criteria. The output audio data stream
is directed to processing trains 302 and 304 and audio I/O interface
212 along output audio paths 376, 378 and 380.
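
The two modes of combination mentioned above (mixing and switching) may be illustrated by the following minimal sketch. It assumes, purely for illustration, that each input is a frame of aligned 16-bit PCM samples and that the switching criterion is signal energy; neither assumption is stated in the description, and the function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix(streams):
    """Sum aligned 16-bit PCM frames sample by sample, clipping to int16."""
    mixed = []
    for samples in zip(*streams):
        total = sum(samples)
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed


def select_loudest(streams):
    """Select the single frame with the highest energy (sum of squares)."""
    return max(streams, key=lambda frame: sum(s * s for s in frame))


# Illustrative frames: local audio plus audio from endpoints 102 and 104.
near = [100, -200, 300]
remote_102 = [1000, 2000, -3000]
remote_104 = [500, 32000, -400]

mixed_frame = mix([near, remote_102, remote_104])
switched_frame = select_loudest([near, remote_102, remote_104])
```

Clipping is one simple way to keep the summed samples within range; other predetermined criteria for switching could be substituted for the energy measure used here.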

Video switching/continuous presence module 306 combines video data
received from remote conference endpoints 102 and 104 with locally
generated video data (received from video I/O interface 214 via video
data path 382, and being typically representative of images of the
near conference participants). Again, the term "combine" is used in
its broadest and most general sense. In one mode of operation, video
switching/continuous presence module 306 may select one of the video
data input streams for output based on predetermined criteria (for
example, it may select for output the video data stream corresponding
to the conference endpoint of the currently speaking participants). In
a second mode of operation (referred to as the "continuous presence
mode"), video switching/continuous presence module 306 may construct a
composite image wherein images corresponding to conference endpoints
are displayed in different sectors of the composite image. The video
data stream output (or plurality of outputs) from video
switching/continuous presence module 306 is thereafter distributed to
processing trains 302 and 304 and video I/O interface 214 via video
data paths 390, 392 and 394.

In the transmission mode, processing train 302 is configured to
receive the audio and video data streams output by audio mixing module
308 and video switching/continuous presence module 306. The received
data streams are then encoded and combined to form a mixed encoded
audio/video data stream, and the encoded audio/video data stream is
transmitted to the circuit switch 350 via data path 344. Similarly,
processing train 304 receives the audio and video streams output by
audio mixing module 308 and video switching/continuous presence module
306, encodes and combines the audio and video data streams, and
transmits the encoded audio/video data stream to the circuit switch
350 via data path 346. For each encoded audio/video data stream, the
circuit switch 350 allocates an IMUX which aggregates the data streams
into a wideband data stream on the bus 226, preferably utilizing a
bonding protocol.

FIG. 4 depicts components of an exemplary processing train 302.
Processing train 302 includes a communication process 404 and video
and audio codecs 406 and 408. In the receive mode, the combined data
stream 344 is directed to communication process 404 which carries out
a predetermined set of functions with respect to data stream 344.
According to one embodiment of the invention, communication process
404 implements the multiplexing, delay compensation and signaling
functions set forth in ITU Recommendation H.320 ("Narrow-Band Visual
Telephone Systems and Terminal Equipment"). In particular,
communication process 404 includes a multiplexer/demultiplexer for (in
the receive mode) extracting separate audio and video signals from
mixed data stream 344 in accordance with ITU Recommendation H.221.
Communication process 404 may further include a delay compensation
process for inducing a delay in the audio data path in order to
maintain lip synchronization. A system control unit is incorporated
into communication process 404 and is configured to establish a common
mode of operation with remote conference endpoint 102 in accordance
with ITU Recommendation H.242.
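
The delay compensation process mentioned above may be pictured as a fixed-length queue placed in the audio data path. The following is a minimal sketch under the assumption that audio arrives in discrete frames and that the required delay is known as a whole number of frame periods; the class name and the use of an empty byte string for silence are hypothetical.

```python
from collections import deque


class AudioDelayLine:
    """Delays audio frames by a fixed number of frame periods."""

    def __init__(self, delay_frames, silence=b""):
        # Pre-fill with silence so the first real frame emerges only
        # after `delay_frames` further calls to push().
        self._queue = deque([silence] * delay_frames)

    def push(self, frame):
        """Accept the newest frame and release the oldest queued one."""
        self._queue.append(frame)
        return self._queue.popleft()


# Illustrative use: hold audio back by two frame periods.
line = AudioDelayLine(delay_frames=2)
delayed = [line.push(f) for f in (b"a0", b"a1", b"a2", b"a3")]
```

In practice the delay would be chosen so that the audio emerges in step with the corresponding decoded video; how that value is determined is outside the scope of this sketch.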

Audio codec 408 receives the audio data stream from communication
process 404 and applies redundancy reduction decoding in accordance
with a standard (e.g., ITU Recommendation G.711) or proprietary audio
compression algorithm. The decoded audio data stream is then sent to
audio mixing module 308, as described above. Similarly, video codec
406 receives the video data stream and applies redundancy reduction
decoding in accordance with a standard (e.g., ITU Recommendation
H.261) or proprietary video compression algorithm. The decoded video
data stream is subsequently sent to video switching/continuous
presence module 306 for combination with video data generated by
remote conference endpoint 104 and near conference endpoint 100, as
described above in connection with FIG. 3.
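
The combination of decoded video performed in the continuous presence mode may be sketched as the tiling of frames into sectors of a composite image. The sketch below assumes a two-by-two layout and frames that have already been scaled to the sector size, represented as lists of pixel rows; the layout, the representation, and the function names are illustrative assumptions only.

```python
def compose_quadrants(frames, width, height, fill=0):
    """Tile up to four equally sized frames into a 2x2 composite image.

    Each frame is a list of `height` rows, each a list of `width` pixels.
    Sectors without a frame are filled with `fill`.
    """
    if len(frames) > 4:
        raise ValueError("this sketch handles at most four endpoints")
    composite = [[fill] * (2 * width) for _ in range(2 * height)]
    for index, frame in enumerate(frames):
        top = (index // 2) * height   # sector row offset
        left = (index % 2) * width    # sector column offset
        for y in range(height):
            composite[top + y][left:left + width] = frame[y]
    return composite


def solid(value, width, height):
    """Build a frame in which every pixel has the same value."""
    return [[value] * width for _ in range(height)]


# Illustrative frames for near endpoint 100 and remote endpoints 102, 104.
image = compose_quadrants(
    [solid(1, 2, 1), solid(2, 2, 1), solid(3, 2, 1)], width=2, height=1
)
```

With three endpoints the fourth sector is simply left at the fill value; a practical module could instead choose a different layout for each number of participants.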

In the transmit mode, video codec 406 encodes the video data stream
output by video switching/continuous presence module 306
(representative, for example, of a "continuous presence" image) using
a standard or proprietary video compression algorithm (e.g., H.261)
and delivers the encoded video data to communication process 404.
Audio codec 408 encodes the audio data stream output by audio mixing
module 308 (representative, for example, of the blended speech of
conference participants located at near conference endpoint 100 and
remote conference endpoints 102 and 104) using a standard or
proprietary audio compression algorithm (e.g., G.711) and delivers the
encoded audio data to communication process 404.

Communication process 404 multiplexes the encoded audio and video data
streams into a single audio/video data stream 344 of relatively high
bandwidth. The audio/video data stream is conveyed to circuit switch
350, which breaks up and distributes the high-bandwidth audio/video
data signal over plural ISDN channels as further described
hereinbelow.
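
The inverse multiplexing operation, in which a high-bandwidth stream is broken up over plural channels and later recombined, may be illustrated with the following simplified sketch. It distributes bytes round-robin and assumes that all channels deliver their data in order and without relative delay; an actual bonding protocol must also align channels that arrive with different delays, which is not modeled here. The function names are hypothetical.

```python
def imux_split(payload: bytes, channel_count: int):
    """Distribute a wideband byte stream round-robin over the channels."""
    return [payload[i::channel_count] for i in range(channel_count)]


def imux_combine(channel_data):
    """Re-interleave per-channel bytes into the original wideband stream."""
    total = sum(len(chunk) for chunk in channel_data)
    count = len(channel_data)
    return bytes(channel_data[i % count][i // count] for i in range(total))


# Illustrative round trip over four channels.
parts = imux_split(b"ABCDEFGHIJ", 4)
restored = imux_combine(parts)
```

The same pair of operations serves both directions: splitting in the transmit mode and combining in the receive mode.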

It is noted that, while not depicted in the Figures, processing train
302 may include a data codec for encoding and decoding still images
and the like received from or transmitted to remote conference
endpoints 102 and 104.

With reference to FIG. 5, the network interface 224 includes a time
division multiplexer 502 which receives the wideband data stream 226
from the circuit switch 350. The time division multiplexer 502 is
coupled to a plurality of ISDN ports 504 for receiving and
transmitting signals on lines 240, 242, 244, and 246.

The present invention advantageously utilizes software-based
processing of video and audio data streams to implement a multi-point
conferencing capability in a conference endpoint. By dynamically
generating a separate instance of a processing train for each remote
endpoint session, a videoconferencing system embodying the invention
may easily and flexibly accommodate endpoint sessions comprising a
range of connection bandwidths and communication protocols. Other
advantages will occur to those of ordinary skill upon review of the
foregoing description and the associated figures.

It is to be understood that the detailed description set forth above
is provided by way of example only. Various details of design,
implementation or mode of operation may be modified without departing
from the true spirit and scope of the invention, which is not limited
to the preferred embodiments discussed in the description, but instead
is set forth in the following claims.